CAT: the CELCT Annotation Tool
نویسندگان
چکیده
This paper presents CAT CELCT Annotation Tool, a new general-purpose web-based tool for text annotation developed by CELCT (Center for the Evaluation of Language and Communication Technologies). The aim of CAT is to make text annotation an intuitive, easy and fast process. In particular, CAT was created to support human annotators in performing linguistic and semantic text annotation and was designed to improve productivity and reduce time spent on this task. Manual text annotation is, in fact, a time-consuming activity, and conflicts may arise with the strict deadlines annotation projects are frequently subject to. Thanks to its adaptability and user-friendly interface, CAT can positively contribute to improve time management in annotation project. Further, the tool has a number of features which make it an easy-to-use tool for many types of annotations. Even if the first prototype of CAT has been used to perform temporal and event annotation following the It-TimeML specifications, the tool is general enough to be used for annotating a broad range of linguistic and semantic phenomena. CAT is freely available for research purposes.
منابع مشابه
LaBB-CAT: an Annotation Store
“ONZE Miner”, an open-source tool for storing and automatically annotating Transcriber transcripts, has been redeveloped to use “annotation graphs” as its data model. The annotation graph framework provides the new software, “LaBB-CAT”, greater flexibility for automatic and manual annotation of corpus data at various independent levels of granularity, and allows more sophisticated annotation st...
متن کاملCoreference Annotation Scheme and Relation Types for Hindi
This paper describes a coreference annotation scheme, coreference annotation specific issues and their solutions through our proposed annotation scheme for Hindi. We introduce different co-reference relation types between continuous mentions of the same coreference chain such as ‘Part-of’, ‘Function-value pair’ etc. We used Jaccard similarity based Krippendorff‘s’ alpha to demonstrate consisten...
متن کاملFrom Text to Knowledge for the Semantic Web: the ONTOTEXT Project
This paper presents the general objectives of the ONTOTEXT project (From Text to Knowledge for the Semantic Web), and the activities carried out during the first year of its development cycle. First, the task of annotating huge amounts of textual data (e.g. those available on the Web or in local document collections) will be introduced, focusing on its importance in order to enhance the interop...
متن کاملThe PhyloFacts FAT-CAT web server: ortholog identification and function prediction using fast approximate tree classification
The PhyloFacts 'Fast Approximate Tree Classification' (FAT-CAT) web server provides a novel approach to ortholog identification using subtree hidden Markov model-based placement of protein sequences to phylogenomic orthology groups in the PhyloFacts database. Results on a data set of microbial, plant and animal proteins demonstrate FAT-CAT's high precision at separating orthologs and paralogs a...
متن کاملContig annotation tool CAT robustly classifies assembled metagenomic contigs and long sequences
In modern-day metagenomics, there is an increasing need for robust taxonomic annotation of long DNA sequences from unknown micro-organisms. Long metagenomic sequences may be derived from assembly of short-read metagenomes, or from long-read single molecule sequencing. Here we introduce CAT, a pipeline for robust taxonomic classification of long DNA sequences. We show that CAT correctly classifi...
متن کامل